DNA SEQUENCE ENCODING BOVINE α-LACTALBUMIN
AND METHODS OF USE

FIELD OF THE INVENTION 
The present invention relates generally to a
DNA sequence encoding bovine α-lactalbumin and to methods
of producing proteins, including recombinant proteins, in
the milk of lactating genetically engineered or
transgenic mammals. The present invention also relates
to genetically engineered or transgenic mammals that
secrete the recombinant protein. The present invention
is also directed to a genetic marker for identifying
animals with superior milk-producing characteristics.

REFERENCE TO CITED ART 
Reference is made to the section preceding
the CLAIMS for a full bibliographic citation of the art
cited herein.

DESCRIPTION OF THE PRIOR ART 

α-Lactalbumin is a major whey protein found
in cow's milk (Eigel et al., 1984). The term "whey
protein" includes a group of milk proteins that remain
soluble in "milk serum" or whey after the precipitation
of casein, another milk protein, at pH 4.6 and 20°C.
α-Lactalbumin has these characteristics.

α-Lactalbumin is a secretory protein that
normally comprises about 2.5% of the total protein in
milk. α-Lactalbumin has been used as an index of mammary
gland function in response to hormonal regulation in
bovine explant culture (Akers et al., 1981; Goodman et
al., 1983) and as an index of udder development (McFadden
et al., 1986). α-Lactalbumin interacts with galactosyl
transferase and therefore plays an essential role in the
biosynthesis of the milk sugar lactose (Brew, K. and R.L.
Hill, 1975). Lactose is an important component in milk
and contributes to milk osmolality. It is the most
constant constituent in cow's milk (Larson, 1985).
α-Lactalbumin is useful as an index of lactogenesis in
cultured mammary tissue (McFadden et al., 1987). It is
therefore believed that α-lactalbumin is an important
protein in controlling milk yield and can be used as an
indicator of mammary function.

The expression of bovine α-lactalbumin may be
a potential rate-limiting process in dairy cattle. If
greater expression of the α-lactalbumin gene can be
obtained, then more milk and milk protein could be
produced. In other words, α-lactalbumin is a potential
Quantitative Trait Locus (QTL).

SUMMARY OF THE INVENTION 

One object of the present invention is to
detect possible genetic differences in the expression of
bovine α-lactalbumin.

Another object of the present invention is to
provide a DNA sequence encoding a mammary-specific bovine
α-lactalbumin protein having a specified nucleotide
sequence.

It is also an object of the present invention
to provide a method for genetically engineering the
incorporation of one or more copies of a construct
comprising an α-lactalbumin control region, which
construct is specifically activated in the mammary
tissue.

These objects and others are addressed by the
present invention, which is directed to a DNA sequence
encoding bovine α-lactalbumin having a specified
nucleotide sequence.

The present invention is also directed to an
expression vector comprising this DNA sequence. Further,
the present invention is directed to the protein
α-lactalbumin having the nucleotide sequence.

The present invention is also directed to an
expression system comprising a mammary-specific
α-lactalbumin control region which, when genetically
incorporated into a mammal, permits the female of
that mammalian species to produce the desired recombinant
protein in its milk.

The present invention is also directed to a
genetically engineered or transgenic mammal comprising
the specified DNA sequence encoding bovine α-lactalbumin.

The present invention is also directed to a
DNA sequence coding for α-lactalbumin, which is
operatively linked to an expression system coding for a
mammary-specific α-lactalbumin protein control, or any
control region which specifically activates α-lactalbumin
in milk or in mammary tissue, through a DNA sequence
coding for a signal peptide that permits secretion and
maturation of the α-lactalbumin in the mammary tissue.

The present invention is also directed to a
process for genetically engineering the incorporation of
one or more copies of a construct comprising an
α-lactalbumin control region which specifically activates
α-lactalbumin in milk or in mammary tissue. The control
region is operatively linked to a DNA sequence coding for
a desired recombinant protein through a DNA sequence
coding for a signal peptide that permits the secretion
and maturation of α-lactalbumin in the mammary tissue.

The present invention is also directed to a
process for the production and secretion into a mammal's
milk of an exogenous recombinant protein. The steps
include producing milk in a genetically engineered or
transgenic mammal. The mammal is characterized by an
expression system comprising an α-lactalbumin control
region. The control region is operatively linked to an
exogenous DNA sequence coding for the recombinant protein
through a DNA sequence coding for a signal peptide
effective in secreting and maturing the
recombinant protein in mammary tissue. The milk is then
collected for use. Alternatively, the exogenous
recombinant protein is isolated from the milk.

The present invention is also directed to a
selection characteristic for identifying superior milk
and milk protein producing animals comprising a DNA
sequence encoding bovine α-lactalbumin and having a
specified nucleotide sequence.

The present invention is also directed to a
selection characteristic for identifying superior milk
and milk protein producing mammals. The mammals are
characterized by inherited genetic material in the DNA
structure of the mammal. The genetic material encodes at
least one desired dominant selectable marker for bovine
α-lactalbumin. One such marker is adenine, which is
located at the -13 position on the control region of the
DNA sequence for α-lactalbumin. The present invention is
also directed to a method of predicting superior milk and
milk protein production in animals comprising identifying
the selection characteristic discussed above.

The present invention is further directed to
a method for modifying the milk composition in mammals
which comprises inserting a DNA sequence encoding bovine
α-lactalbumin having a specified nucleotide sequence.

The DNA sequence and the various methods of
using it have potentially beneficial uses for dairy
farmers, artificial insemination organizations, genetic
marker companies, and embryo transfer and cloning
companies, to name a few.

The uses for this genetic marker include the
identification of superior nuclear transfer embryos and
the identification of superior embryos to clone.

The present invention also will aid in the
progeny testing of sires. The specified DNA sequence can
be used as a genetic marker to identify possible elite
sires in terms of milk production and milk protein
production. This will increase the reliability of buying
superior dairy cattle.

The present invention also will provide
assistance in farm management decisions, such as sire
selection and selective culling. The physiological
markers assist in determining future production
performance in addition to a cow's pedigree. From this
information, one could buy or retain a heifer with a DNA
sequence encoding α-lactalbumin of the present invention
and consider culling a heifer without the proper
sequence.

BRIEF DESCRIPTION OF THE DRAWINGS

Fig. 1 is a schematic illustration of a
partial restriction map of the bovine α-lactalbumin of
the present invention. The sequence contains 2.0
kilobases of a 5' flanking region, 1.7 kilobases of a
coding region and 8.8 kilobases of a 3' flanking region.
Digestion with Hpa I yields a 2.8 kilobase fragment
containing the whole 5' flanking region.

Fig. 2 depicts in schematic outline a map of
the plasmid A-lac Pro/pIC 20R. A Hpa I fragment of the
genomic clone was inserted into the EcoRV site of pIC
20R. The Hpa I fragment contains 2.1 kb of 5' flanking
DNA, the signal peptide coding region of α-lactalbumin and
8 bases encoding the mature α-lactalbumin protein. Six
unique enzyme sites are available for attaching various
genes to the sequence.

Fig. 3 is a schematic illustration of a
detailed map of the α-lactalbumin 5' flanking control
region cloned in the EcoRV site of the plasmid pIC 20R
(SEQ ID NO:1, SEQ ID NO:2).

Fig. 4 is a schematic illustration of a
detailed map of the 8.0 kilobase Bgl II fragment.

Fig. 5 depicts the nucleotide sequence (SEQ
ID NO: 3) of the control/enhancer region of the bovine
α-lactalbumin protein.

Fig. 6 depicts in schematic outline a map of
a plasmid containing the bovine α-lactalbumin-bovine
β-casein gene construct.

Fig. 7 illustrates a sequence comparison
between human and bovine genes in the 5' flanking region
of the bovine α-lactalbumin protein: among the U.S.
bovine sequence of the present invention (SEQ ID NO: 4), a
human sequence (SEQ ID NO: 5) and the French bovine
sequence (SEQ ID NO: 6) for the putative steroid response
element, and among the U.S. bovine sequence of the present
invention (SEQ ID NO: 7), a human sequence (SEQ ID NO: 8)
and the French bovine sequence (SEQ ID NO: 9) for the RNA
polymerase binding region, surrounding three of the four
nucleotide sequence variant mutations.

Fig. 8 is a DOTPLOT™ graph comparing the
bovine α-lactalbumin 5' flanking sequence to the same
region of the human α-lactalbumin sequence.

Fig. 9 is a DOTPLOT™ graph comparing the
bovine α-lactalbumin 5' flanking sequence to the same
region of the guinea pig α-lactalbumin sequence.

Fig. 10 is a DOTPLOT™ graph comparing the
bovine α-lactalbumin 5' flanking sequence to the same
region of the rat α-lactalbumin sequence.

Fig. 11 is a graph illustrating expression
levels observed in each of three α-lactalbumin transgenic
mouse lines.

Fig. 12 is a 4% NuSieve autoradiographic gel
of Mnl I digested PCR products.

Fig. 13 is a graph illustrating a scatter 
plot of each data point in Fig. 12 as well as mean values 
for each of the three genotypes. 

DETAILED DESCRIPTION OF THE PREFERRED INVENTION
In the Description the following terms are
employed:

Genetic engineering, manipulation or
modification: the formation of new combinations of
materials by the insertion of nucleic acid molecules
produced outside the cell into any virus, bacterial
plasmid or other vector system so as to allow their
incorporation into a host organism in which they do not
naturally occur, but in which they are capable of
continued propagation at least throughout the life of the
host organism. Although the term incorporates transgenic
alteration, the manipulation of the genomic sequence does
not have to be permanent, i.e., the genetic engineering
can affect only the animal which was directly
manipulated.

Transgenic animals: permanently genetically
engineered animals created by introducing new DNA
sequences into the germ line via addition to the egg.

It is within the scope of the present
application to use any mammal for the invention.
Examples of mammals include cows, sheep, goats, mice,
oxen, camels, water buffaloes, llamas and pigs.
Preferred mammals include those that produce large
volumes of milk and have long lactating periods.

The present invention is directed to a gene
which encodes bovine α-lactalbumin. This gene has been
isolated and characterized. The 5' flanking region of
the gene has been cloned into six vectors for use as a
mammary-specific control region in the production of
genetically engineered mammals. To better understand the
regulation of this control region, 2.0 kilobases of the
5' flanking sequence have been sequenced. The
α-lactalbumin 5' flanking sequence serves as a useful
mammary-specific "control/enhancer complex" for
engineering genetic constructs that could be capable of
driving the expression of novel and useful proteins in
the milk of genetically engineered or transgenic mammals.
This results in an increase in milk production and the
protein composition in milk, a change in the milk and/or
protein composition in milk, and the production of
valuable proteins in the milk of genetically engineered
or transgenic mammals. Such proteins include insulin,
growth hormone, growth hormone releasing factor,
somatostatin, tissue plasminogen activator, tumor
necrosis factor, lipocortin, coagulation factors VIII and
IX, the interferons, colony stimulating factor, the
interleukins, urokinase, and industrial enzymes such as
cellulases, hemicellulases, peroxidases, and thermally
stable enzymes.

The α-lactalbumin gene is the preferred gene
for use in the process because its 5' control region is
mammary specific. It also exerts the tightest
lactational control of all milk proteins. Further, it is
regulated independently of other milk proteins and is
produced in large quantity by lactating animals.
Total Sequence 

A gene encoding the milk protein bovine
α-lactalbumin was isolated from a bovine genomic library
(Woychik, 1982). The Charon 28 lambda library was probed
using a bovine α-lactalbumin cDNA (Hurley, 1987) and a
770 base pair α-lactalbumin polymerase chain reaction
product. The positive lambda clone includes 12.5
kilobases of inserted bovine sequence, consisting of 2.0
kilobases of a 5' flanking (control/enhancer) region, a
1.7 kilobase coding region and 8.8 kilobases of a 3'
flanking region. A partial restriction map of the clone
is illustrated in Fig. 1.

A 2.8 kilobase Hpa I fragment including the
2.0 kilobase control region along with the signal peptide
coding region was cloned into the EcoRV site of the
plasmid pIC 20R. The plasmid is illustrated in schematic
outline in Fig. 2.

An 8.0 kilobase Bgl II fragment containing a
2.0 kilobase 5' flanking control region, a 1.7 kilobase
coding region, 3.0 kilobases of a 3' flanking region and
1.2 kilobases of lambda DNA has also been isolated.
Reference is made to Fig. 4 for a map of the 8.0
kilobase fragment. Transgenic mice have been produced
using the Bgl II fragment.

Control/Enhancer Region

The 2.0 kilobase 5' flanking region has been
cloned into the vectors pIC 20R and Bluescript KS+. A
schematic illustration of the α-lactalbumin 5' flanking
control region cloned in the EcoRV site of pIC 20R is
depicted in Figs. 2 and 3 (SEQ ID NO:1, SEQ ID NO:2).

The construct's multiple cloning site, which
exists downstream of the signal peptide coding region,
permits various genes to be attached to the α-lactalbumin
control region. Thus, this vector allows for easy
attachment of specific coding sequences of genes. It
contains all elements necessary for expression of
proteins in milk, i.e., a mammary-specific control
region, a mammary-specific signal peptide coding region
and a mature protein-signal peptide splice site which is
able to be cleaved in the mammary gland. The vector also
contains many unique restriction enzyme sites for ease of
cloning. Attachment of genes to this control region will
allow for mammary expression of the genes when these
constructs are placed into mammals. These vectors also
contain the α-lactalbumin signal peptide coding sequence,
which will allow for proper transport of the expressed
protein into the milk of the lactating mammal.

The control region construct has driven
mammary expression of a desired protein in transgenic
mice. Bovine α-lactalbumin levels of greater than 1
mg/ml have been observed in the milk of transgenic mouse
lines as described in Example 2 (infra). Constructs
containing the 2.0 kilobase region attached to the bovine
β-casein gene (Bonsing, J., et al., 1988) as well as the
bacterial reporter gene chloramphenicol acetyl
transferase have been produced in our lab. Fig. 6 is a
schematic representation of a plasmid containing the
bovine α-lactalbumin-bovine β-casein gene construct. The
genomic DNA sequence containing the bovine β-casein gene
was attached to the bovine α-lactalbumin 5' flanking
sequence. The vector contains
the polyadenylation site of β-casein along with
approximately 100 base pairs of 5' flanking DNA. The 100
base pairs of 5' flanking DNA are attached to the bovine
α-lactalbumin 5' flanking region at the -100 position.
The construct uses the proximal promoter elements of
β-casein and the distal control region elements of
α-lactalbumin. The β-casein construct has been used to
produce transgenic mice, as is illustrated in the
examples.

To understand the control of the
control/enhancer region, the 2.0 kilobases of 5' flanking
region were sequenced. A single strand copy of the
sequence is listed in Fig. 5 (SEQ ID NO:3). The sequence
is listed 5' to 3' with the signal peptide coding region
underlined.

Regulatory Sequences

Potential regulatory sequences contained
within the 5' flanking region of bovine α-lactalbumin
have been identified. There are possible regulatory
regions in the introns as well as in the 3' flanking
region. Portions of the suspected control regions were
examined for possible sequence differences in the
population which might be related to milk and milk
protein production of individual cows. The differences
in the regulatory regions of α-lactalbumin are expected
to lead to differences in expression of α-lactalbumin
mRNA. The increased cellular content of mRNA will
increase the expression of α-lactalbumin protein with a
concomitant increase in lactose synthase resulting,
ultimately, in a milk and milk protein production
increase. This type of mechanism would be considered a
major gene effect on milk and milk protein production by
α-lactalbumin. The changes are viewed as causally linked
to changes in milk and milk protein production and not
correlatively linked. Correlatively linked traits are
those which are closely associated with an unknown
genetic locus which has the direct impact on the
quantitative trait.

Sequence differences between the U.S.
Holstein and the French cow (Vilotte et al., 1987) of an
unknown breed were found at four positions within the 5'
flanking region. One of the differences lies in a
sequence which would indicate that it is a steroid
hormone response element. Two other differences were
noted in the RNA polymerase binding region and a fourth
in the signal peptide coding region of the gene. Because
of the relationship between these sequences and known
control sequences of mammalian genes, all the variations
occur in regions one would expect to be involved in
regulation of the amount of mRNA produced. Further,
genetic variations which occur in factors binding to
these regions would also be expected to cause changes.

Fig. 7 illustrates sequence variants observed
in the 5' flanking region between the U.S. bovine
sequence of the present invention, human (Hall et al.,
1987) and the French bovine (Vilotte, 1987) for the
putative steroid response element (SEQ ID NO: 4, SEQ ID
NO: 5 and SEQ ID NO: 6 respectively) and for the RNA
polymerase binding region (SEQ ID NO: 7, SEQ ID NO: 8,
and SEQ ID NO: 9 respectively).
All of the differences occur in highly conserved portions
of the gene as seen by comparing this region to the same
region of the human α-lactalbumin gene. Fig. 7 also
shows that the positions where the bovine genes differ
are the same positions where the human gene differs from
the bovine. These data indicate that the bases are part
of a potentially important control region.

A method has been devised to give a clear-cut
differentiation between two of the variants at a position
-13 bases from the start of transcription, i.e., 13 base
positions from the signal peptide coding region. The two
variants are termed α-lac (-13) A and α-lac (-13) B.
The α-lac (-13) A genotype has an adenine base at position
-13; the α-lac (-13) B genotype has either a guanine,
thymine or cytosine base at -13. They can be
differentiated with a simple restriction enzyme digest of
an amplified polymerase chain reaction (PCR) product
using a specific restriction enzyme (Mnl I). Because of
the specificity of the restriction enzyme Mnl I, the
restriction analysis is unable to distinguish among
the three possible bases of the B genotype. The α-lac
(-13) A allele contains an extra Mnl I site at position
-13, giving the smaller band observed on the gel.
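
The site-based genotype call described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the laboratory procedure: it assumes the Mnl I recognition sequence 5' CCTC 3' (read as GAGG when the site lies on the opposite strand), and the two short sequences are hypothetical toy amplicons, not portions of the bovine sequence. The function names are likewise hypothetical.

```python
# Illustrative sketch: call the alpha-lac (-13) allele by counting Mnl I sites.
# Assumption: Mnl I recognizes 5' CCTC 3'; a site on the opposite strand
# appears as GAGG when reading the top strand.
MNLI_SITE = "CCTC"

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def count_sites(seq, site=MNLI_SITE):
    """Count occurrences of a non-palindromic site on either strand."""
    seq = seq.upper()
    rc_site = reverse_complement(site)
    width = len(site)
    count = 0
    for i in range(len(seq) - width + 1):
        window = seq[i:i + width]
        if window == site or window == rc_site:
            count += 1
    return count


def call_allele(amplicon, baseline_sites):
    """Call 'A' when the amplicon carries more sites than the B baseline."""
    return "A" if count_sites(amplicon) > baseline_sites else "B"


# Hypothetical toy amplicons: a single base change to adenine creates a site.
TOY_B = "TTGCAGCGGTTACGATTA"
TOY_A = "TTGCAGAGGTTACGATTA"

baseline = count_sites(TOY_B)
print(call_allele(TOY_A, baseline), call_allele(TOY_B, baseline))
```

As in the gel assay, a call made this way cannot say which of the three non-adenine bases a B allele carries; it only reports whether the extra site is present.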

To amplify the appropriate region of DNA,
oligonucleotides which frame the sequence of interest
were synthesized. These oligonucleotides were chosen
because of their specific chemical characteristics.
These oligonucleotides were then used in a polymerase
chain reaction to amplify the framed portion of the
α-lactalbumin gene. The oligonucleotides have the
following sequences:

α-lac Seq. 1 (SEQ ID NO: 10)
5' ACGCTTGTAAAACGACGGCCAGTTGATTCTCAGTCTCTGTGGT 3'
α-lac Seq. 2 (SEQ ID NO: 11)
5' AGCATCAGGAAACAGCTATGACCTGGGTGGCATGGAATAGGAT 3'

Restriction fragment analysis (Sambrook, J.
et al., 1989) was used to examine animals from a number
of breeds of cattle. In most breeds, namely, Jersey,
Guernsey, Brown Swiss, Simmental and Brahman, only one of
two genotypes is found. This is the α-lac (-13) B
genotype. However, in the most popular and highest milk
producing breed of cattle, the Holstein, two genotypes
occur at this position. The frequency of the A genotype
was 27% in random samples, while the frequency of the B
genotype was 73%. Holsteins contain both the genotype
found in the other breeds as well as a separate distinct
genotype which appears to have arisen within the last
thirty years in the U.S. Holstein population as
determined by examining pedigrees of sires currently in
use. It appears that this genotype has unknowingly been
selected for using traditional animal selection.
Homozygous and heterozygous animals are found within the
Holstein population.

The genotype (α-lac (-13)) has been examined
for its correlation with milk and milk protein
production. The three additional variations are being
examined to determine the frequency of their differences
in the cattle population and their correlation with milk
and milk protein production. The possible linkage of
these genotypes is also being examined using DNA
sequencing. The goal of this technology is to identify
the optimal regulatory genotype for α-lactalbumin and to
select animals with those particular characteristics.

Detection and Selection of Four Genetic Variants

The region of sequence where the α-lac (-13)
variation occurs can be amplified using the polymerase
chain reaction (PCR) (Sambrook et al., 1989) and two of
the following primers which were developed. Each primer
allows for amplification of a specific portion of the
α-lactalbumin gene. Combinations of the listed primers
can be used to amplify the region between any two of the
primer locations listed below.

Primer No.  (SEQ ID NO:)  Primer sequence                    Primer location
                                                             (from translation
                                                             start site)

1           (12)          5' CTCTTCCTGGATGTAAGGCTT 3'        (-120)-(-100)
2           (13)          5' TCCTGGGTGGTCATTGAAAGGACT 3'     (-2000)-(-1975)
3           (14)          5' CAATGTGGTATCTGGCTATTTAGTG 3'    (-717)-(-692)
4           (15)          5' AGCCTGGGTGGCATGGAATA 3'         (+53)-(+33)
5           (16)          5' GAAACGCGGTACAGACCCCT 3'         (+453)-(+433)
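
As a rough aid for comparing the primers listed above, the following Python sketch tabulates length, GC count and an approximate melting temperature for each one. The primer sequences are copied from the table; the melting temperature uses the Wallace rule of thumb, Tm = 2(A+T) + 4(G+C), which is an assumption introduced here for illustration and is not a method stated in this description. The function names are hypothetical.

```python
# Illustrative sketch: basic properties of the five listed primers.
# Sequences are those of SEQ ID NO: 12-16 as given in the table.
PRIMERS = {
    1: "CTCTTCCTGGATGTAAGGCTT",
    2: "TCCTGGGTGGTCATTGAAAGGACT",
    3: "CAATGTGGTATCTGGCTATTTAGTG",
    4: "AGCCTGGGTGGCATGGAATA",
    5: "GAAACGCGGTACAGACCCCT",
}


def gc_count(seq):
    """Number of G and C bases in the primer."""
    return sum(1 for base in seq.upper() if base in "GC")


def wallace_tm(seq):
    """Approximate Tm in degrees C by the Wallace rule (an assumption here)."""
    gc = gc_count(seq)
    at = len(seq) - gc
    return 2 * at + 4 * gc


def summarize(primers):
    """Return (number, length, GC fraction, approximate Tm) per primer."""
    rows = []
    for number, seq in sorted(primers.items()):
        rows.append((number, len(seq), gc_count(seq) / len(seq), wallace_tm(seq)))
    return rows


for number, length, gc_fraction, tm in summarize(PRIMERS):
    print(f"primer {number}: {length} nt, GC {gc_fraction:.2f}, Tm ~{tm} C")
```

The Wallace rule is coarse, particularly for primers longer than about 20 bases, so the values are useful only for checking that a chosen primer pair is roughly matched.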

After amplification of the specific region,
the DNA is either sequenced or digested with restriction
enzymes to detect the sequence differences. In the case
of the α-lac (-13) variation, the sequence difference can
be seen using the restriction enzyme Mnl I (5' CCTC 3'
recognition site). The PCR DNA product is digested with
Mnl I and then run on a 4% NuSieve agarose gel to observe
the polymorphism.

A 650 base pair sequence containing all four
of the variations is being examined using a unique
sequencing technique. PCR is initially used to amplify a
770 base pair portion of the α-lactalbumin 5' flanking
region. Another PCR reaction is then performed using a
portion of the initial reaction and the following primers
(SEQ ID NO: 10 and SEQ ID NO: 11 respectively):

α-lac seq. 1  5' ACGCTTGTAAAACGACGGCCAGTTGATTCTCAGTCTCTGTGGT 3'
α-lac seq. 2  5' AGCATCAGGAAACAGCTATGACCTGGGTGGCATGGAATAGGAT 3'

The primers listed above contain a portion of
the α-lactalbumin gene as well as both M13 DNA sequencing
primers. The primers are designed to allow for DNA
sequencing to be performed in both directions on the PCR
DNA product. The final PCR product will contain the
region of α-lactalbumin containing the four genetic
variants, the two M13 sequencing priming regions and 5
"dummy bases" on the end to aid in the M13 primer
binding.
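
The three-part layout of these tailed primers (dummy bases, M13 priming region, gene-specific portion) can be checked with a short Python sketch. The two M13 sequences below are commonly cited universal M13 sequencing primers and are an assumption made for this illustration; the description does not spell them out. The function name is hypothetical.

```python
# Illustrative sketch: split each tailed primer into its three parts.
# Assumption: these 18-base universal M13 primer sequences are the ones
# embedded in the tailed primers; they are not given in the text.
M13_FORWARD = "TGTAAAACGACGGCCAGT"
M13_REVERSE = "CAGGAAACAGCTATGACC"

# Tailed primers as listed (SEQ ID NO: 10 and SEQ ID NO: 11).
ALPHA_LAC_SEQ_1 = "ACGCTTGTAAAACGACGGCCAGTTGATTCTCAGTCTCTGTGGT"
ALPHA_LAC_SEQ_2 = "AGCATCAGGAAACAGCTATGACCTGGGTGGCATGGAATAGGAT"


def split_tailed_primer(primer, m13_site):
    """Return (dummy bases, M13 region, gene-specific region)."""
    primer = primer.upper().replace(" ", "")
    start = primer.find(m13_site)
    if start < 0:
        raise ValueError("M13 priming region not found in primer")
    end = start + len(m13_site)
    return primer[:start], primer[start:end], primer[end:]


for name, primer, site in (
    ("seq. 1", ALPHA_LAC_SEQ_1, M13_FORWARD),
    ("seq. 2", ALPHA_LAC_SEQ_2, M13_REVERSE),
):
    dummy, m13, gene = split_tailed_primer(primer, site)
    print(name, "dummy:", dummy, "M13:", m13, "gene-specific:", gene)
```

Under this assumption each primer splits into 5 leading bases, an 18-base M13 region and a 20-base gene-specific portion, which agrees with the 5 "dummy bases" mentioned above.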

Comparison of Highly Conserved Portions of the 5'
Flanking Region of α-Lactalbumin Between Species

Reference is made to Figs. 8-10 for
DOTPLOT™ graphs comparing the bovine α-lactalbumin
sequence to the same region of the human (Fig. 8), guinea
pig (Fig. 9), and rat (Fig. 10). The region in Fig. 8
(human) spans 819 base pairs. The sequences are highly
conserved to about 700 base pairs. The region in Fig. 9
(guinea pig) spans 1381 base pairs. The sequences are
highly conserved to about 700 base pairs, but then
diverge. The region in Fig. 10 (rat) spans 1337 base
pairs. The sequences are highly conserved to about 700
base pairs, but then diverge. Species differences in
control regions would be expected to occur in
non-conserved regions of the sequence.

Comparison of 5' Flanking Region of Bovine α-Lactalbumin
to Other Bovine Milk Protein Genes

Portions of the 5' flanking region of the
other bovine milk protein genes (αs1- and αs2-casein,
β-casein, κ-casein and β-lactoglobulin) which are highly
conserved with the α-lactalbumin 5' flanking region were
identified. It is probable that sequence differences
within these regions will also have an effect on mRNA
production as well as final protein production. Two
examples of these highly homologous regions are listed
below.

The bovine α-lactalbumin sequence from (-161)
- (-115) (SEQ ID NO: 17) is compared to the bovine β-casein
sequence (SEQ ID NO: 18) corresponding to the same region
of the gene. Percent similarity is 69% over 46 bases.

α-lactalbumin  AGGAAGCTCAATGTTTCTTTGTTGGTTTTACTGGCCTCTCTTGTCA
β-casein       AGGAGGCT.ATTCTTTCCTTTTAGTCTATACTGTCTTCGCTCTTCA
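
A comparison of this kind can be approximated with a simple column-by-column identity count over two aligned sequences, sketched below in Python. This is an illustration under stated assumptions: it scores only exact matches and counts gap columns (written "." or "-") as mismatches, whereas the "percent similarity" figures quoted in this description may come from a different scoring scheme, so the sketch is not expected to reproduce them exactly. The function name is hypothetical.

```python
# Illustrative sketch: percent identity between two pre-aligned sequences.
# Assumption: gaps are written as "." or "-" and count as mismatches.
def percent_identity(aligned_a, aligned_b, gap_chars=".-"):
    """Percentage of alignment columns holding the same base in both rows."""
    a = aligned_a.upper()
    b = aligned_b.upper()
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    if not a:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(
        1 for x, y in zip(a, b) if x == y and x not in gap_chars
    )
    return 100.0 * matches / len(a)


# The aligned 46-column region quoted above.
ALPHA_LAC = "AGGAAGCTCAATGTTTCTTTGTTGGTTTTACTGGCCTCTCTTGTCA"
BETA_CASEIN = "AGGAGGCT.ATTCTTTCCTTTTAGTCTATACTGTCTTCGCTCTTCA"

print(f"{percent_identity(ALPHA_LAC, BETA_CASEIN):.1f}% identical columns")
```

Strict identity is the most conservative choice; a similarity score that gives partial credit to some substitutions would report a higher figure for the same alignment.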

The bovine α-lactalbumin sequence (SEQ ID
NO: 19) from (-1420) - (-1351) is compared to the bovine
β-casein sequence (SEQ ID NO: 20) corresponding to the
same region of the gene. Percent similarity is 75% over
69 bases.

α-lactalbumin  TATAAGAAATCAGGCTTTAGAGACTGATGTAGAGAGAATGAGCCCTGGCA
β-casein       TCTCAGAAATCACACTTTTTTGCCTGTG GCCTTGGCA

α-lactalbumin  TACCAGAAGCTAACAGCTA
β-casein       ACCAAAAGCTAACACATA



The included data indicate that the bovine
α-lactalbumin gene will be useful as a selection tool in
the dairy cattle industry as well as a valuable
control/enhancer and gene to be used in the field of
genetically engineered mammals. The control region we
have cloned contains the necessary regulatory elements to
express genes in the milk of genetically engineered
mammals as well as the "high expressing genotype" as
shown by our milk and milk protein production and
sequence variation data. These facts make this a useful
gene in both industrial and research areas. Application
of these techniques to the other milk proteins will allow
for the selection of valuable genotypes corresponding to
the β-casein, αs1- and αs2-casein and κ-casein genes and
the β-lactoglobulin genes.
Coding Region

The coding region of the α-lactalbumin
protein includes a 1.7 kilobase sequence.

3' Flanking Region

The 3' flanking region is an 8.8 kilobase
flanking region downstream of the DNA sequence coding for
the desired recombinant protein. This region apparently
stabilizes the RNA transcript of the expression system
and thus increases the yield of desired protein from the
expression system.
Operation 

The above-described expression systems may be
prepared by methods well known in the art. Examples
include various ligation techniques employing
conventional linkers, restriction sites, etc.
Preferably, these expression systems are part of larger
plasmids.

After isolation and purification, the
expression systems or constructs are added to the gene
pool which is to be genetically altered.

The methods for genetically engineering
mammals are well known to the art. Reference is made
to Alberts, B. et al., 1989 and Lewin, B., 1990, for
textbook descriptions of genetic engineering and
transgenic alteration of animals. Briefly, genetic
engineering involves the construction of expression
vectors so that a cDNA clone or genomic structure is
connected directly to a DNA sequence that acts as a
strong promoter for DNA transcription. By means of
genetic engineering, mammalian cells, such as mammary
tissue, can be induced to make vast quantities of useful
proteins.

For the purposes of this invention, the term
"genetic engineering," as defined supra in the list of
definitions, includes single line alteration, i.e.,
genetic alteration only during the life of the affected
animal with no germ line permanence. The construct can
be genetically incorporated in mammalian glands such as
mammary glands and mammalian stem cells.

Genetic engineering also includes transgenic
alteration, i.e., the permanent insertion of the gene
sequence into the genomic structure of the affected
animal and any offspring. Transgenically altering a
mammal involves microinjecting a DNA construct into the
pronuclei of the fertilized mammalian egg to cause one or
more copies of the construct to be retained in the cells
of the developing mammal. In a transgenic animal, the
engineered genes are permanently inserted into the germ
line of the animal.

The genetically engineered mammal is then
characterized by an expression system comprising the
α-lactalbumin control region operatively linked to an
exogenous DNA sequence coding for the recombinant protein
through a DNA sequence coding for a signal peptide
effective in secreting and maturing the recombinant
protein in mammary tissue. In order to produce and
secrete the recombinant protein into the mammal's milk,
the transgenic mammal must be allowed to produce the
milk, after which the milk is collected. The milk may
then be used in standard manufacturing processes. The
exogenous recombinant protein may also be isolated from
the milk according to methods known to the art.
Selection Characteristics 

The α-lactalbumin control/enhancer sequence
of Fig. 1 is also important as a selection characteristic
for identifying superior or elite milk producing mammals.
Presently, those in the dairy cattle business can only
rely on pedigree information, which is frequently not
available, to predict milk and milk protein production in
mammals, specifically the bovine species. The study of
physiological markers as a means for determining milk and
milk protein production has received some interest. The
most common physiological marker traits studied in dairy
cattle are hormones, enzymes, and different blood
metabolites. Components of the immune system have also
been studied. Traits listed as possible marker traits
for milk yield include thyroxine, blood urea nitrogen,
growth hormones, insulin-like growth factors and insulin,
and glucose and free fatty acids. While these techniques
have shown some advances in predicting milk and milk
protein production in a dairy animal, there is currently
no other reliable means to predict these characteristics.

The present invention provides a selection
characteristic for identifying superior milk and milk
protein-producing mammals comprising inherited genetic
material which is DNA occurring in the genetic structure
of the mammal in which the genetic material encodes a
dominant selectable marker for bovine α-lactalbumin.

The DNA sequence disclosed herein serves as a
characteristic marker for elite milk producing mammals.

The examples below describe the invention 
disclosed herein, although the invention is not to be 
understood as limited in any way to the terms and scope 
of the examples. 
EXAMPLES

Example 1: α-lac (-13) variation study.

Forty-two mammals were selected in a
stratified random manner to provide mammals of a wide
range of milk and milk protein production capabilities
within the UW herd.

DNA was isolated according to procedures
known to the art from a random sample of 42 Holstein
dairy cows in the University of Wisconsin-Madison herd.
Each mammal was genotyped as described previously for the
α-lactalbumin (-13) variation using a 4% NuSieve gel of
MnlI-digested PCR products.

The gene frequency in this population is 28%
for the α-lac (-13) A allele and 72% for the α-lac (-13) B
allele. Each of the distinct genotypes is shown on the gel
in Fig. 12. The legend for the gel of Figure 12 is as
follows:

Lane 1:    Molecular Weight Standards
Lanes 2-3: heterozygous α-lac (-13) AB
Lane 4:    homozygous α-lac (-13) BB
Lane 5:    heterozygous α-lac (-13) AB
Lane 6:    homozygous α-lac (-13) BB
Lane 7:    homozygous α-lac (-13) AA
Lane 8:    heterozygous α-lac (-13) AB

Analysis of the genetic capabilities of the
42 mammals indicates a possible major gene effect caused
by the α-lac (-13) allele or linked to the α-lac (-13)
allele. A scatter plot of each data point as well as
mean values for each of the three genotypes is
illustrated in Fig. 13. Holstein cows were compared
using their predicted transmitting ability for milk.
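
Allele frequencies such as the 28% and 72% quoted for the 42 cows are obtained by counting the two alleles carried by each genotyped animal. The Python sketch below illustrates that counting. It is only an illustration: the function name is hypothetical, and the input is the set of seven sample lanes named in the Fig. 12 legend, not the 42-cow sample, whose individual genotype counts are not given in the text.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Return {'A': freq, 'B': freq} by counting both alleles of each genotype."""
    alleles = Counter()
    for genotype in genotypes:
        # Each genotype is a two-letter string such as "AA", "AB" or "BB".
        if len(genotype) != 2 or set(genotype) - {"A", "B"}:
            raise ValueError(f"unexpected genotype: {genotype!r}")
        alleles.update(genotype)
    total = sum(alleles.values())
    if total == 0:
        raise ValueError("no genotypes supplied")
    return {allele: alleles[allele] / total for allele in ("A", "B")}


# Genotypes of lanes 2-8 as listed in the Fig. 12 legend.
FIG12_LANES = ["AB", "AB", "BB", "AB", "BB", "AA", "AB"]

freqs = allele_frequencies(FIG12_LANES)
print(f"A: {freqs['A']:.3f}  B: {freqs['B']:.3f}")
```

Applied to the full herd sample, the same counting would give the population frequencies reported above.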

The data indicate that the α-lac (-13) A
genotype is the preferred genotype for milk and milk
protein production. Table 1 shown below indicates the
statistical association of differences in milk and milk
protein production ability observed between each of the
genotypes for the traits listed below. Analysis of
variance and t tests (LSD) were performed on the data.
All of the production yield traits were positively
correlated with the α-lac (-13) A allele. Milk protein
percentage was negatively correlated to the α-lac (-13) A
allele.

Table 1

Trait/Genotype        α-lac (-13) AA   α-lac (-13) AB   α-lac (-13) BB
PTA (Milk)/AA         -                N.S.             p<0.02
PTA (Milk)/AB         N.S.             -                p<0.02
ME305 (Milk)/AA       -                N.S.             N.S.
ME305 (Milk)/AB       N.S.             -                p<0.1
PTA (Protein #)/AA    -                N.S.             N.S.
PTA (Protein #)/AB    N.S.             -                p<0.1
PTA (Protein %)/AA    -                N.S.             p<0.01
PTA (Protein %)/AB    N.S.             -                p<0.01
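
The comparisons in Table 1 rest on an analysis of variance followed by least significant difference (LSD) t tests between genotype pairs. A minimal Python sketch of those two calculations follows, using only the standard library. The function names and the PTA values are hypothetical and serve only to show the arithmetic; they are not the study data, and converting the statistics to p-values would additionally need F and t distribution tables or a statistics package.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within, MSE) for {group name: list of values}."""
    all_values = [x for values in groups.values() for x in values]
    grand_mean = mean(all_values)
    ss_between = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in groups.values())
    ss_within = sum((x - mean(v)) ** 2 for v in groups.values() for x in v)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    mse = ss_within / df_within  # pooled within-group variance
    f_stat = (ss_between / df_between) / mse
    return f_stat, df_between, df_within, mse


def lsd_t_statistic(groups, first, second, mse):
    """Fisher LSD t statistic for two groups, using the pooled MSE from the ANOVA."""
    n1, n2 = len(groups[first]), len(groups[second])
    difference = mean(groups[first]) - mean(groups[second])
    return difference / (mse * (1 / n1 + 1 / n2)) ** 0.5


# Hypothetical PTA (Milk) values per genotype, for illustration only.
pta_milk = {
    "AA": [950.0, 1100.0, 1020.0],
    "AB": [900.0, 980.0, 1010.0, 940.0],
    "BB": [600.0, 720.0, 650.0, 700.0, 690.0],
}

f_stat, df_b, df_w, mse = one_way_anova(pta_milk)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
for pair in (("AA", "AB"), ("AA", "BB"), ("AB", "BB")):
    print(pair, f"t = {lsd_t_statistic(pta_milk, *pair, mse):.2f}")
```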

Example 2. Production of transgenic mice to study the
regulation of bovine α-lactalbumin gene expression.

Genomic Library Screening:

The gene encoding the milk protein bovine
α-lactalbumin was isolated from a bovine genomic library
(Woychik, 1982). The genomic library was screened
according to the following procedure. Approximately 1.5
million lambda plaques were transferred to nylon
membranes using procedures described by Maniatis et al.
(1989). The α-lactalbumin cDNA (Hurley, 1987) or a 770
base pair PCR product was nick translated (BRL) with
α-32P-labeled dCTP. Blots were prehybridized overnight
(65°C) then hybridized for 16 hours at 65°C. Blots were
washed (twice in 2X SSC, 1% SDS; once in 0.1X SSC, 0.1%
SDS) at 65°C and placed on Kodak X-OMAT film for
autoradiography. An 8.0 kilobase fragment containing the
α-lactalbumin gene was purified as illustrated in Fig. 4.
The 8.0 kilobase fragment contained 2.1 kilobases of 5'
flanking region, the 1.7 kilobase coding region and 2.6
kilobases of 3' flanking region.

Production of transgenic mice:

Mature C57B6 X DBA2J F1 (B6D2) females were
superovulated (PMSG and hCG) and mated with ICR or B6D2
males to yield fertilized eggs for pronuclear
microinjection. The eggs were microinjected using a
Leitz micromanipulator and a Nikon inverted microscope.
Forty normal appearing two cell embryos were transferred
to each pseudopregnant recipient (University of
Wisconsin-Madison Biotechnology Center Transgenic Mouse
Facility, Dr. Jan Heideman).

Screening of transgenic mice using PCR:

Tail DNA was extracted using the method
described by Constantini et al. (1986). Polymerase chain
reaction (PCR) was performed using 10 µl 10x PCR reaction
buffer (Promega Corp., Madison, WI), 200 µM each dNTP
(Pharmacia Intl., Milwaukee, WI), 1.0 µM each primer
(upstream primer 25mer -712 to -687 (5'
CAATGTGGTATCTGGCTATTTAGTG 3') (SEQ ID NO: 14), downstream
primer 20mer +39 to +59 (5' AGCCTGGGTGGCATGGAATA 3') (SEQ
ID NO: 15)), 1 unit Taq DNA polymerase (Promega Corp.,
Madison, WI) and 1 µg genomic DNA. Volume was adjusted
to 100 µl with double distilled sterile water and the
reaction was overlaid with heavy mineral oil. Samples
were subjected to 30 cycles (94°C 2 min., 50°C 1.5 min.,
72°C 1.5 min.). Products were run in a 1% agarose gel
and stained with ethidium bromide.
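
The primer pair described above can be sanity-checked with a few lines of Python. The sketch below is an illustration only: the helper names are hypothetical, and the product-length estimate assumes the coordinates are numbered without a position 0, which the text does not state. Under that assumption the span from -712 to +59 is 771 positions, close to the 770 base pair PCR product mentioned in the library screening procedure.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def spanned_length(upstream_start: int, downstream_end: int) -> int:
    """Positions spanned from a negative to a positive coordinate, assuming no position 0."""
    if not (upstream_start < 0 < downstream_end):
        raise ValueError("expected a negative start and a positive end")
    return -upstream_start + downstream_end


UPSTREAM_PRIMER = "CAATGTGGTATCTGGCTATTTAGTG"  # SEQ ID NO: 14 as given in the text
DOWNSTREAM_PRIMER = "AGCCTGGGTGGCATGGAATA"     # SEQ ID NO: 15 as given in the text

print(len(UPSTREAM_PRIMER), len(DOWNSTREAM_PRIMER))
# The downstream primer anneals to the coding strand, so its reverse
# complement is what should be looked for in the coding-strand sequence.
print(reverse_complement(DOWNSTREAM_PRIMER))
print(spanned_length(-712, 59))
```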
Mouse Milking:

The mice were separated from their litters
for four hours and then anesthetized (0.01 ml/g body
weight I.P. injection of 36% propylene glycol, 10.5%
ethyl alcohol (95%), 41.5% sterile water, and 12% sodium
pentobarbital (50 mg/ml)). After being anesthetized the
mice were injected I.M. with 0.3 I.U. oxytocin and milked
using a small vacuum milking machine. Three of fifty-one
live offspring were identified as being transgenic using
polymerase chain reaction. Reference is made to Fig. 14
for a graph illustrating expression levels observed in
each of the 3 α-lactalbumin transgenic mouse lines.

ELISA: 

Second generation mammals from one line were
milked and analysis was performed using an ELISA (enzyme
linked immunosorbent assay) for bovine α-lactalbumin
according to the following procedure:

1. Coat 1/40k bovine α-lactalbumin
antiserum, 100 µl per well (in 0.05M carbonate buffer, pH
9.6), on Nunc-Immuno Plate IF MaxiSorp.

2. Wash 4x with wash buffer (0.025% Tween
20 in PBS, pH 7.2).

3. Add 50 µl assay buffer (0.04M MOPS,
0.12M NaCl, 0.01M EDTA, 0.1% gelatin, 0.05% Tween 20,
0.005% chlorhexidine digluconate, Leupeptin 50 mg/ml, pH
7.4).

4. Add 50 µl of standards and samples (in
assay buffer) in triplicate.

5. Add 50 µl 1/100k diluted α-lactalbumin
biotin conjugate.

6. Incubate overnight at 4°C.

7. Wash 4x with wash buffer.

8. Add 100 µl 1/10k assay buffer diluted
ExtrAvidin-peroxidase (Sigma). Incubate 2 hours at RT.

9. Wash 4x twice with wash buffer.

10. Add 125 µl fresh substrate buffer (200
µl tetramethylbenzidine (20 mg/ml) in DMSO, 64 µl 0.5M
hydrogen peroxide, 19.74 ml sodium acetate, pH 4.8).

11. Incubate for 12 minutes at RT.

12. Add 50 µl 0.5M sulfuric acid to stop
substrate reaction.

13. Read absorbance at 450 nm minus 600 nm
in an EIA autoreader.
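
The plate readings from step 13 are turned into concentrations by averaging the triplicate wells and reading the result off the standard curve. The Python sketch below shows one simple way to do this. It is an assumption-laden illustration, not the procedure used by the inventors: the function names and standard values are hypothetical, and straight-line interpolation on log concentration stands in for whatever curve fit was actually used. Because sample and biotin conjugate are added together, the signal would be expected to fall as the α-lactalbumin concentration rises; the interpolation works for either direction.

```python
import math
from statistics import mean


def corrected_absorbance(a450: float, a600: float) -> float:
    """Dual-wavelength reading: absorbance at 450 nm minus absorbance at 600 nm."""
    return a450 - a600


def triplicate_mean(readings):
    """Mean corrected absorbance of the wells, given (A450, A600) pairs."""
    return mean(corrected_absorbance(a450, a600) for a450, a600 in readings)


def interpolate_concentration(signal, standards):
    """Interpolate on log10(concentration) between the two bracketing standards.

    standards is a list of (concentration, signal) pairs with distinct signals.
    """
    points = sorted(standards, key=lambda point: point[1])
    for (conc_a, sig_a), (conc_b, sig_b) in zip(points, points[1:]):
        if sig_a <= signal <= sig_b:
            fraction = (signal - sig_a) / (sig_b - sig_a)
            log_conc = math.log10(conc_a) + fraction * (
                math.log10(conc_b) - math.log10(conc_a)
            )
            return 10 ** log_conc
    raise ValueError("signal lies outside the range of the standard curve")


# Hypothetical standards: (concentration in ng/ml, mean corrected absorbance).
STANDARDS = [(1.0, 1.0), (10.0, 0.5), (100.0, 0.1)]

sample_signal = triplicate_mean([(0.85, 0.10), (0.86, 0.10), (0.84, 0.10)])
print(f"{interpolate_concentration(sample_signal, STANDARDS):.2f} ng/ml")
```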

Bovine α-lactalbumin was present at
concentrations up to and beyond 1.0 mg/ml of mouse
milk. Expression was determined by Western blotting in
the following steps. The 14% PAGE gel was transferred to an
Immobilon-P membrane (Millipore), which was blocked in
0.02 M sodium phosphate, 0.12 M NaCl, 0.01% gelatin, 0.05%
Tween 20, pH 7.2, and incubated with anti-bovine
α-lactalbumin (1/2000 dilution) for 2 hours at room
temperature. The gel was washed twice (2 min.) with an
ELISA wash buffer and incubated with goat anti-rabbit
IgG-HRP for 2 hours at room temperature, followed by
washing 3 times with a wash buffer and washing once with
double-distilled water. The gel was placed in a substrate
solution (25 mg 3,3'-diaminobenzidine, 1 ml 1% CoCl2 in
H2O, 49 ml PBS pH 7.4 and 0.05 ml 30% H2O2) and monitored
for color development. The membrane was air dried.

It is understood that the invention is not
confined to the particular constructions and arrangements
herein illustrated and described, but embraces such
modified forms thereof as come within the scope of the
claims following the bibliography.

SEQUENCE LISTING



(1) GENERAL INFORMATION:

(i) APPLICANT: BLECK, GREGORY T.
BREMEL, ROBERT D.
(ii) TITLE OF INVENTION: DNA SEQUENCE ENCODING BOVINE
ALPHA-LACTALBUMIN AND METHODS OF USE
(iii) NUMBER OF SEQUENCES: 20
(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: ANDRUS, SCEALES, STARKE & SAWALL
(B) STREET: 100 E. WISCONSIN AVE., SUITE 1100
(C) CITY: MILWAUKEE
(D) STATE: WI
(E) COUNTRY: USA
(F) ZIP: 53202-4178
(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.25
(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE:
(C) CLASSIFICATION:
(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Sara, Charles S
(B) REGISTRATION NUMBER: 30,492
(C) REFERENCE/DOCKET NUMBER: F. 3262*1
(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: (608) 255-2022
(B) TELEFAX: (608) 255-2182
(C) TELEX: 26832 ANDSTARK



(2) INFORMATION FOR SEQ ID NO: 1:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:
ATGACCATGA TTACGAATTC ATCGTA 26

(2) INFORMATION FOR SEQ ID NO: 2:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 88 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:
GAACAGTTAT CTAGATCTCG AGCTCGCGAA AGCTTGCATG CCTGCAGGTC GACTCTAGAG 60
GATCCCCGGG TACCGAGCTC GAATTCAC 88

(2) INFORMATION FOR SEQ ID NO: 3:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2044 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(ix) FEATURE:
(A) NAME/KEY: signal peptide coding region
(B) LOCATION: 1943..2043
(ix) FEATURE:
(A) NAME/KEY: inherited control region for α-lactalbumin
(B) LOCATION: 1966
(ix) FEATURE:
(A) NAME/KEY: putative steroid response element
(B) LOCATION: 1433..1446
(ix) FEATURE:
(A) NAME/KEY: RNA polymerase binding region
(B) LOCATION: 1961..1978
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

GATCAGTCCT GGGTGGTCAT TGAAACGACT GATGCTGAAG TTCAAGCTCC AATACTTTGG 60
CCACCTGATG CGAAGAACTG ACTCATGTGA TAAGACCCTG ATACTGGGAA AGATTGAAGG 120
CAGGAGGAGA AGGGATGACA GAGGATGGAA GAGTTGGATG GAATCACCAA CTOGATGGAC 180
ATGAGTTTGA GCAAGCTTCC AGGAGTTGGT AATGGGCAGG GAAGCCTGGC GTGCTGCAGT 240
CCATGGGGTT GCAAAGAGTT GGACACTACT GAGTGACTGA ACTGAACTGA TAGTGTAATC 300
CATGGTACAG AATATAGGAT AAAAAAGAGG AAGAGTTTGC CCTGATTCTG AAGAGTTGTA 360
GGATATAAAA GTTTAGAATA CCTTTAGTTT GGAAGTCTTA AATTATTTAC TTAGGATGGG 420
TACCCACTGC AATATAAGAA ATCAGGCTTT AGAGACTGAT GTAGAGAGAA TGAGCCCTGG 480
CATACCAGAA GCTAACAGCT ATTGGTTATA GCTGTTATAA CCAATATATA ACCAATATAT 540
TGGTTATATA GCATGAAGCT TGATGCCAGC AATTTGAAGG AACCATTTAG AACTAGTATC 600
CTAAACTCTA CATGTTCCAG GACACTGATC TTAAAGCTCA GGTTCAGAAT CTTGTTTTAT 660
AGGCTCTAGG TGTATATTGT GGGGCTTCCC TGGTGGCTCA GATGGTAAAG TGTCTGCCTG 720
CAATGTGGGT GATCTGGGTT CGATCCCTGG CTTGGGAAGA TCCCCTGGAG AAGGAAATOG 780
CAACCCACTC TAGTACTCTT ACCTGGAAAA TTCCATGGAC AGAGGAGCCT TGTAAGCTAC 840
AGTCCATGGG ATTGCAAAGA GTTGAACACA ACTGAGCAAC TAAGCACAGC ACAGTACAGT 900
ATACACCTGT GAGGTGAAGT GAAGTGAAGG TTCAATGCAG GGTCTCCTGC ATTGCAGAAA 960
GATTCTTTAC CATCTGAGCC ACCAGGGAAG CCCAAGAATA CTGGAGTGGG TAGCCTATTC 1020
CTTCTCCAGG GGATCTTCCC ATCCCAGGAA TTGAACTGGA GTCTCCTGCA TTTCAGGTGG 1080
ATTCTTCACC AGCTGAACTA CCAGGTGGAT ACTACTCCAA TATTAAAGTG CTTAAAGTCC 1140
AGTTTTCCCA CCTTTCCCAA AAAGGTTGGG TCACTCTTTT TTAACCTTCT GTGGCCTACT 1200
CTGAGGCTGT CTACAAGCTT ATATATTTAT GAACACATTT ATTGCAAGTT GTTAGTTTTA 1260
GATTTACAAT GTGGTATCTG GCTATTTAGT GGTATTGGTG GTTGGGGATG GGGAGGCTGA 1320
TAGCATCTCA GAGGGCAGCT AGATACTGTC ATACACACTT TTCAAGTTCT CCATTTTTGT 1380
GAAATAGAAA GTCTCTGGAT CTAAGTTATA TGTGATTCTC AGTCTCTGTG GTCATATTCT 1440
ATTCTACTCC TGACCACTCA ACAAGGAAGC AAGATATCAA GGGACACTTG TTTTGTTTCA 1500
TGCCTGGGTT GAGTGGGCCA TGACATATGA TGATGTACAG TCCTTTTCCA TATTCTGTAT 1560
GTCTCTAAGA GGAAGGAGGA GTTGGCCGTG GACCCTTTGT GCATTTTCTG ATTGCTTCAC 1620
TTGTATTACC CCTGAGGCCC CCTTTGTTCC TGAAATAGGT TGGGCACATC TTGCTTCCTA 1680
GAACCAACAC TACCAGAAAC AACATAAATA AAGCCAAATG GGAAACAGGA TCATGTTTGT 1740
AACACTCTTT GGGCAGGTAA CAATACCTAG TATGGACTAG AGATTCTGGG GAGGAAAGGA 1800
AAAGTGGGGT GAAATTACTG AAGGAAGCTC AATGTTTCTT TGTTGGTTTT ACTGGCCTCT 1860
CTTGTCATCC TCTTCCTGGA TGTAAGGCTT GATGCCAGGG CCCCTAAGGC TTTTTCCACA 1920
AATAAAAGGA GGTGAGCAGT GTGGTGACCC CATTTCAGAA TCTTGAGGGG TAACCAAAAT 1980
GATGTCCTTT GTCTCTCTGC TCCTGGTAGG CATCCTATTC CATGCCACCC AGGCTGAACA 2040
GTTA 2044
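
The feature locations given for SEQ ID NO: 3 can be cross-checked against the 3' end of the sequence. The Python sketch below does so for the -13 position and the RNA polymerase binding region. It rests on two assumptions: the final 124 bases (positions 1921 to 2044) are read with the scanner's "X" taken as T and "6" taken as G, and position -1 is taken to be the base immediately upstream of the first ATG. The helper names are hypothetical.

```python
# Positions 1921-2044 of SEQ ID NO: 3, reading OCR "X" as T and "6" as G (assumption).
TAIL_START = 1921
TAIL = (
    "AATAAAAGGAGGTGAGCAGTGTGGTGACCCCATTTCAGAATCTTGAGGGGTAACCAAAAT"
    "GATGTCCTTTGTCTCTCTGCTCCTGGTAGGCATCCTATTCCATGCCACCCAGGCTGAACA"
    "GTTA"
)


def base_at(position: int) -> str:
    """Base at a 1-based position of SEQ ID NO: 3 (within the tail only)."""
    return TAIL[position - TAIL_START]


def region(start: int, end: int) -> str:
    """Bases from start to end inclusive, 1-based positions within the tail."""
    return TAIL[start - TAIL_START : end - TAIL_START + 1]


def first_atg_position() -> int:
    """1-based position of the A of the first ATG in the tail."""
    return TAIL.index("ATG") + TAIL_START


atg = first_atg_position()
minus_13 = atg - 13  # position -1 is the base just upstream of the ATG

print("first ATG at", atg)
print("-13 position is", minus_13, "base", base_at(minus_13))
print("1961..1978:", region(1961, 1978))
```

Under these assumptions the -13 position falls on 1966, the location listed for the inherited control region, and the base there is adenine; the 1961..1978 span reads the same as SEQ ID NO: 7.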

(2) INFORMATION FOR SEQ ID NO: 4:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 14 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:
CATATTCTAT TCTA 14

(2) INFORMATION FOR SEQ ID NO: 5:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 15 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
CATATTCTAT TCCTA 15

(2) INFORMATION FOR SEQ ID NO: 6:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 15 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
CATATTCTAT TTCTA 15

(2) INFORMATION FOR SEQ ID NO: 7:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:
TCTTGAGGGG TAACCAAA 18

(2) INFORMATION FOR SEQ ID NO: 8:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:
TCTTGGGGGT AGCCAAA 17

(2) INFORMATION FOR SEQ ID NO: 9:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
TCTTGGGGGG TCACCAAA 18

(2) INFORMATION FOR SEQ ID NO: 10:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 43 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:
ACGCTTGTAA AACGACGGCC AGTTGATTCT CAGTCTCTGT GGT 43

(2) INFORMATION FOR SEQ ID NO: 11:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 43 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:
AGCATCAGGA AACAGCTATG ACCTGGGTGG CATGGAATAG GAT 43

(2) INFORMATION FOR SEQ ID NO: 12:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:
CTCTTCCTGG ATGTAAGGCT T 21

(2) INFORMATION FOR SEQ ID NO: 13:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
TCCTGGGTGG TCATTGAAAG GACT 24

(2) INFORMATION FOR SEQ ID NO: 14:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:
CAATGTGGTA TCTGGCTATT TAGTG 25

(2) INFORMATION FOR SEQ ID NO: 15:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:
AGCCTGGGTG GCATGGAATA 20

(2) INFORMATION FOR SEQ ID NO: 16:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:
GAAACGCGGT ACAGACCCCT 20

(2) INFORMATION FOR SEQ ID NO: 17:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 46 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:
AGGAAGCTCA ATGTTTCTTT GTTGGTTTTA CTGGCCTCTC TTGTCA 46

(2) INFORMATION FOR SEQ ID NO: 18:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 45 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:
AGGAGGCTAT TCTTTCCTTT TAGTCTATAC TGTCTTCGCT CTTCA 45

(2) INFORMATION FOR SEQ ID NO: 19:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 69 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:
TATAAGAAAT CAGGCTTTAG AGACTGATGT AGAGAGAATG AGCCCTGGCA TACCAGAAGC 60
TAACAGCTA 69

(2) INFORMATION FOR SEQ ID NO: 20:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 55 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:
TCTCAGAAAT CACACTTTTT TGCCTGTGGC CTTGGCAACC AAAAGCTAAC ACATA 55

CLAIMS

What is claimed is: 

1. A mammary specific DNA sequence encoding
bovine α-lactalbumin and promoting quantitative
differences in gene expression among mammals, wherein the
DNA sequence is characterized by variations in the gene
structure in the control region of bovine α-lactalbumin.

2. The DNA sequence of claim 1 wherein one of
the variations is in the -13 position of the DNA
sequence.

3. The DNA sequence of claim 2 wherein the -13
position is occupied by adenine.

4. The DNA sequence of claim 1 comprising the
following DNA sequence (SEQ ID NO: 19) in the control
region of bovine α-lactalbumin:

TATAAGAAATCAGGCTTTAGAGACTGATGTAGAGAGAATGAGCCCTGGCATACCAGA
AGCTAACAGCTA.

5. The DNA sequence of claim 1 comprising the
following DNA sequence (SEQ ID NO: 17) in the control
region of bovine α-lactalbumin:

AGGAAGCTCAATGTTTCTTTGTTGGTTTTACTGGCCTCTCTTGTCA.

6. The DNA sequence of claim 1 comprising the
DNA sequence listed in Fig. 5 (SEQ ID NO: 3) in the
control region of bovine α-lactalbumin.

7. An expression vector comprising the DNA
sequence of claim 1.

8. An expression system comprising a mammary
specific α-lactalbumin control region construct which
when genetically incorporated into a mammal permits the
female species of that mammal to produce the desired
recombinant protein in its milk.

9. The expression system of claim 8 which
comprises at least one α-lactalbumin control region
construct operatively linked to a DNA sequence coding for
a signal peptide and α-lactalbumin.

10. The expression system of claim 8 which
comprises a 3' flanking region downstream of the DNA
sequence coding for α-lactalbumin.

11. The expression system of claim 10 wherein the
construct includes a 5' flanking region upstream of the
DNA sequence coding for the signal peptide.

12. The expression system of claim 8 wherein the
construct comprises a 5' α-lactalbumin flanking region
attached to a bovine β-casein gene.

13. The expression system of claim 12 wherein the
construct contains the polyadenylation site of β-casein
and approximately 100 base pairs of 5' α-lactalbumin
flanking region.

14. The expression system of claim 12 wherein the
construct includes the proximal promoter from a first
milk protein and the distal control region of a second
milk protein.

15. The expression system of claim 14 wherein the
first and second milk proteins are selected from the
group consisting of α-lactalbumin, β-casein, αs1-casein,
αs2-casein and κ-casein.

16. The expression system of claim 12 wherein the
construct includes the proximal promoter of β-casein and
the distal control region of α-lactalbumin.

17. A genetically engineered mammal characterized
by an expression system comprising an α-lactalbumin
control region operatively linked to an exogenous DNA
sequence coding for a desired protein to be expressed in
milk through a DNA sequence coding for a signal peptide
effective in secreting and maturing the protein in
mammary tissue.

18. The genetically engineered mammal of claim 17
wherein the α-lactalbumin control region includes a
mammary specific DNA sequence encoding bovine
α-lactalbumin and having the following nucleotide sequence
(SEQ ID NO: 19) in the control region of bovine
α-lactalbumin:

TATAAGAAATCAGGCTTTAGAGACTGATGTAGAGAGAATGAGCCCTGGCATACCAGA
AGCTAACAGCTA.

19. The genetically engineered mammal of claim 17
wherein the α-lactalbumin control region includes a
mammary specific DNA sequence encoding bovine
α-lactalbumin and having the following nucleotide sequence
(SEQ ID NO: 20) in the control region of bovine
α-lactalbumin:

AGGAAGCTCAATGTTTCTTTGTTGGTTTTACTGGCCTCTCTTGTCA.

20. Products produced by the genetically
engineered mammal of claim 17.

21. Semen produced by the genetically engineered
mammal of claim 17.

22. Milk produced by the genetically engineered
mammal of claim 17.

23. A transgenic mammal of claim 17.

24. A DNA sequence coding for α-lactalbumin which
is operatively linked in an expression system of a
mammary specific α-lactalbumin protein control region, or
any control region which specifically activates
α-lactalbumin in milk or in mammary tissue, through a
signal peptide that permits secretion and maturation of
the α-lactalbumin in the mammary tissue.

25. The DNA sequence of claim 24 comprising the
following DNA sequence (SEQ ID NO: 19) in the control
region of bovine α-lactalbumin:

TATAAGAAATCAGGCTTTAGAGACTGATGTAGAGAGAATGAGCCCTGGCATACCAGA
AGCTAACAGCTA.

26. The DNA sequence of claim 24 comprising the
following DNA sequence (SEQ ID NO: 20) in the control
region of bovine α-lactalbumin:

AGGAAGCTCAATGTTTCTTTGTTGGTTTTACTGGCCTCTCTTGTCA.

27. The DNA sequence of claim 24 comprising the
DNA sequence (SEQ ID NO: 3) listed in Fig. 5 in the
control region of bovine α-lactalbumin.

28. A process for genetically engineering the
incorporation of one or more copies of a construct
comprising an α-lactalbumin control region or any control
region sequence specifically activated in mammary tissue,
operatively linked to a DNA sequence coding for a desired
recombinant protein through a DNA sequence coding for a
signal peptide that permits the secretion and maturation
of α-lactalbumin in the mammary tissue.

29. The process of claim 28 wherein the construct
is genetically incorporated into a mammal and the
recombinant protein product is subsequently expressed and
secreted into or along with the milk of the lactating
genetically engineered mammal.

30. The process of claim 29 wherein the construct
is generally incorporated in mammalian embryos, mammalian
mammary glands, or mammalian stem cells.

31. The process of claim 28 wherein the mammals
are cows, sheep, goats, mice, oxen, camels, water
buffaloes, llamas and pigs.

32. A process for the production and secretion
into mammal's milk of an exogenous recombinant protein
comprising the steps of:

a. producing milk in a genetically
engineered mammal characterized by an
expression system comprising α-lactalbumin
control region operatively linked to an
exogenous DNA sequence coding for the
recombinant protein through a DNA sequence
coding for a signal peptide effective in
secreting and maturing the recombinant
protein in mammary tissue;

b. collecting the milk; and

c. isolating the exogenous recombinant
protein from the milk.

33. The process according to claim 32, wherein
said expression system also includes a 3' flanking region
coding for α-lactalbumin downstream of the DNA sequence.

34. The process according to claim 32, wherein
said expression system also includes a 5' flanking region
coding for the signal peptide upstream of the DNA
sequence.

35. A selection characteristic for identifying
superior milk producing mammals comprising a DNA sequence
encoding bovine α-lactalbumin and having the DNA sequence
(SEQ ID NO: 3) listed in Fig. 5 in the control region of
bovine α-lactalbumin.

36. A selection characteristic for identifying
superior milk producing mammals comprising inherited
genetic material which is a mammary specific DNA sequence
encoding bovine α-lactalbumin and promoting quantitative
differences in gene expression among mammals, wherein the
DNA sequence is characterized by variations in the gene
structure in the control region of bovine α-lactalbumin.

37. The selection characteristic of claim 36
wherein one of the variations is in the -13 position of
the DNA sequence.

38. The selection characteristic of claim 37
wherein the -13 position is occupied by adenosine.

39. The selection characteristic of claim 36
comprising the following DNA sequence (SEQ ID NO: 19) in
the control region of bovine α-lactalbumin:

TATAAGAAATCAGGCTTTAGAGACTGATGTAGAGAGAATGAGCCCTGGCATACCAGA
AGCTAACAGCTA.

40. The selection characteristic of claim 36
comprising the following DNA sequence (SEQ ID NO: 20) in
the control region of bovine α-lactalbumin:

AGGAAGCTCAATGTTTCTTTGTTGGTTTTACTGGCCTCTCTTGTCA.

41. The selection characteristic of claim 36
comprising the DNA sequence (SEQ ID NO: 3) listed in Fig.
5 in the control region of bovine α-lactalbumin.

42. A method of predicting superior milk and milk
protein production in mammals comprising comparing
selected positions on the DNA sequence of the inherited
control region for α-lactalbumin in a subject mammal
with analogous positions on the DNA sequence of the
control region for α-lactalbumin from mammals known for
superior milk and milk protein production.

43. The method of claim 42 wherein one of the
selected positions is the -13 position on the control
region of the DNA sequence.

44. The method of claim 43 wherein the -13
position is occupied by the base adenine.

45. The method of claim 42 wherein the selected
DNA sequence comprises a steroid response element.

AMENDED CLAIMS

[received by the International Bureau on 14 December 1992 (14.12.92);
original claims 2, 5, 9-11, 19, 24-31, 33, 35-45 cancelled;
original claims 1, 3, 6, 8, 12, 13, 15, 17, 18, 20-23, 32 amended;
other claims unchanged (4 pages)]

1. An isolated DNA sequence which promotes mammary
specific expression of mRNA in mammary cells of lactating animals
comprising a variant of bovine α-lactalbumin 5' flanking region
regulatory sequence wherein the variant is in the -13 position
from the start of the signal peptide coding region.

3. The DNA sequence of claim 1 wherein the -13
position is occupied by adenine.

4. The DNA sequence of claim 1 comprising DNA sequence
(SEQ ID NO: 19) in the control region of bovine α-lactalbumin:
TATAAGAAATCAGGCTTTAGAGACTGATGTAGAGAGAATGAGCCCTGGCATACCAGAAGCTA
ACAGCTA.

6. The DNA sequence of claim 1 comprising the DNA
sequence listed in Fig. 5 (SEQ ID NO: 3) as the 5' flanking region
of bovine α-lactalbumin.

7. An expression vector comprising the DNA sequence
of claim 1.

8. An expression system comprising a mammary specific
α-lactalbumin control region construct operatively linked to a
DNA sequence coding for a signal peptide, which when genetically
incorporated into a non-human mammal permits the female species
of that mammal to produce the desired recombinant protein in its
milk.

12. The expression system of claim 8 wherein the
construct comprises a 5' α-lactalbumin flanking region attached
to a bovine β-casein DNA sequence.

13. The expression system of claim 12 wherein the
construct contains the polyadenylation site of β-casein 5'
flanking region.

14. The expression system of claim 12 wherein the
construct includes the proximal promoter from a first milk
protein and the distal control region of a second milk protein.

15. The expression system of claim 14 wherein the
first and second milk proteins are selected from the group
consisting of α-lactalbumin, β-lactoglobulin, β-casein,
αs1-casein, αs2-casein and κ-casein.

16. The expression system of claim 12 wherein the
construct includes the proximal promoter of β-casein and the
distal control region of α-lactalbumin.

17. A transgenic non-human mammal containing an
expression system comprising the DNA sequence listed in Fig. 5
(SEQ ID NO: 3) as the α-lactalbumin control region operatively
linked to an exogenous DNA sequence coding for a desired protein
to be expressed in milk through a DNA sequence coding for a
signal peptide effective in secreting and maturing the protein
in mammary tissue.

18. The transgenic non-human mammal of claim 17
wherein the α-lactalbumin control region includes a mammary
specific DNA sequence encoding bovine α-lactalbumin and having
the following nucleotide sequence (SEQ ID NO: 19) in the control
region of bovine α-lactalbumin:

TATAAGAAATCAGGCTTTAGAGACTGATGTAGAGAGAATGAGCCCTGGCATACCAGAAGCTA
ACAGCTA.

20. Products produced by the genetically engineered
non-human mammal of claim 17.

21. Semen produced by the transgenic non-human mammal
of claim 17.

22. Milk produced by the transgenic non-human mammal
of claim 17.

23. A transgenic non-human mammal of claim 17.

32. A process for the production and secretion into
non-human mammal's milk of an exogenous recombinant protein
comprising the steps of:

a. producing milk in a genetically engineered mammal
containing an expression system comprising a
mammary specific α-lactalbumin control region
operatively linked to a DNA sequence coding for
a signal peptide, which is effective in secreting
and maturing the recombinant protein in mammary
tissue;

b. collecting the milk; and

c. isolating the exogenous recombinant protein from
the milk.

34. The process according to claim 32, wherein said
expression system also includes a 5' flanking region coding for
the signal peptide upstream of the DNA sequence.